1. Abstract

The  goal  of  this  research  program  was  to  develop  and  test  a  model 
of  proficient  performance.  Three  aspects  of  proficient  performance  were 
studied: recognitional capacities, perceptual learning, and the use of
analogies.


Weitzenfeld (1977, 1981a) demonstrated on theoretical grounds that
knowing how to perform a task depends on recognitional capacity and cannot be
reduced to knowledge of rules or procedures. This theoretical analysis
complements the empirical demonstration of the existence of special recognitional
capacities in experts, notably by Simon and colleagues (1973, 1980). We
examined the hypothesis that the possession of these recognitional capacities
is the driving force in the superiority of chess masters over chess experts
(a lower category of ability). We did this by comparing the quality of moves
in regulation time games and 5-minute games. Increased time should allow
more detailed analyses of moves, but should have little effect on the move
alternatives first recognized. Both masters and experts showed better
performance with more time, indicating that both use calculational
processes. However, the superiority of the masters did not increase with
additional time; the masters were better in the 5-minute games, and
maintained the same level of superiority in the regulation games. This
suggests that masters and experts have the same level of calculational
skill, and supports the hypothesis that the strength of the chess masters is
primarily due to recognitional capacity: the ability to recognize the strongest
option.


The previous work established the relevance of recognitional
capacities. The growth of such discriminative abilities has been the
focus of work by E. Gibson (1969). We hypothesized that more proficient
subjects have learned to use a greater number of dimensions for
perceiving a task. This was studied for the Cardio-Pulmonary
Resuscitation (CPR) skills of beginners, instructors, and paramedics
(ten subjects in each group). We were able to pinpoint perceptual dimensions
that could be used by paramedics, but not by lesser-skilled personnel. These
differences could be used as the basis for defining perceptual training
requirements for complex tasks. A second study applied the same paradigm
to computer programming, comparing novices (1-3 years experience) with
programmers (more than seven years of experience). Again, clear differences
emerged in the use of perceptual dimensions.

It has been noted that chess masters can recall positions from a
very large number of games, and that in chess analysis a position is characteristically
compared to the same or similar positions reached in previous games through
recent chess history. What role do such comparisons play in the proficiency
of chess masters? How are analogical comparisons made and used?

We studied the use of analogical reasoning for generating predictions
in technological environments. Three models of analogical reasoning
were considered and rejected. The first is the standard a:b::c:d model
employed by test-makers. The second stems from the use of analogical
reasoning to generate new scientific hypotheses. Seven Air Force engineers
were interviewed; all had used comparison cases as analogues for the task
of predicting reliability of subsystems for the B-1. Neither model was
able to account for the performance of the engineers. A third model claims
that analogical reasoning is based on similarity matches, and is
probabilistic. This model was rejected on conceptual grounds: the processes
it relies on are inadequate for the task. To replace these models, we developed
a new theory of analogical reasoning, showing its basis in standard forms of
deductive logic. We were also able to define the conditions under which
analogical reasoning will generate formally valid conclusions. This work
is relevant for any area in which comparisons play an important role, such as
the domain of technological improvement involving design changes in
automobiles, aircraft, etc.

The research performed has implications for a number of applied areas,
such as the development of methods for generating predictions under conditions
of uncertainty, the design of programs for training personnel to reach high
levels of proficiency, and the development of automated decision aids to
support experienced C² personnel.


2.  Research  Objectives 

The research objectives of this program were to review the
literature, perform theoretical analyses, develop research paradigms, and
perform empirical research in the area of highly proficient performance.

The nature of highly proficient performance has recently become
the subject of a great deal of theoretical and empirical research. There
seems to be general agreement that novices are learning individual steps,
along with rules for when to perform these steps. However, highly
proficient performance does not readily display characteristics of following
steps or rules. It has not been demonstrated that the behavior of experts
can be defined in terms of computational operations on formally defined
elements. This creates a challenge for the information processing approach
to modeling highly proficient performance. It also creates an opportunity to
examine some of the assumptions underlying the information processing
approach in psychology, and to attempt to formulate alternative accounts of
expertise that do not rely on a framework of computational operations. For
applied purposes, the development of improved procedures for describing
highly proficient performance could allow more effective methods for selecting
and training highly competent personnel. Decision making is one example of
a skill where our understanding of highly proficient performance can have
important implications. The type of automated decision aids we can develop
should be a function of the needs of proficient decision makers, rather than
of the state-of-the-art in the relevant microprocessor technologies.


3.  Problem  Statement 

Currently, there is no adequate theory of proficient performance.
The thrust of psychological research on learning and competence until recently
has been directed at novices acquiring an unfamiliar skill. However, it
seems highly unlikely that the same processes will account for the difference
between experts and novices. The assumption that experts are simply following
the same procedures as novices (except that the experts are faster and
more accurate) has not received empirical support.

Specifically, it may not be reasonable to assume, as information
processing accounts do, that proficient performance of a skill depends on
the ability to break tasks down into basic elements, to apply rules and
procedures to these elements, and to use higher-level rules for the
accomplishment of simpler rules. As we have discussed elsewhere (Klein, 1978), the
postulation of basic elements of a task runs into the difficulties which
led to the abandonment of logical atomism, and there are a variety of reasons
to doubt that experts are applying formal operations to basic elements,
or are following any rules or procedures at all. If a rule ("If X occurs,
do Y until Z occurs") is seen as the basis for skilled performance, then
how does an expert know when X and Z have occurred? Higher level rules
must be invoked for this guidance, and still higher level rules are needed to
guide the performance of the hierarchy. (See Weitzenfeld, 1981a, for a
fuller account of this problem.)

The work of Herbert Simon (Chase and Simon, 1973; Larkin et al.,
1980) offers an alternative to a calculational theory. Simon and his
co-workers have emphasized recognitional capacities. Their research raises
the question of what role is played by recognitional vs. calculational
capacities for highly proficient personnel.


4. Research Rationale

The general goal of this research program has been to test the
hypothesis that recognitional capacities account for expertise, and to
extend our understanding of proficient performance.

We have developed and refined a theoretical description of proficient
performance (Klein, 1980a) based on recognitional capacities. This description
is currently at a level of specificity that allows empirical testing.
Basically, our account contrasts proficient performers with novices in
terms of recognitional capacities, perceptual learning, and the availability
and use of analogues.

4.1 Recognitional Capacity. We hypothesize a distinction between recognitional
capacities and calculational capacities. Recognitional capacities allow a
proficient performer to immediately recognize specific situations, and the
relevance of goals and strategies. These place no strain on limited
attentional or memory resources. Calculational capacities involve the use
of working memory to examine contingencies. Using chess as an example,
a grandmaster would display recognitional capacities in perceiving several
pieces as one unit, or chunk. Chase and Simon (1973) have estimated that
in the course of their experience, grandmasters acquire the ability to
distinguish between approximately 50,000 patterns of pieces. Larkin et al.
(1980) further propose that the ability to recognize and distinguish between
larger sets of patterns is basic to proficiency development in a variety of
domains. We suggest that the recognition of patterns is accompanied by
a recognition of the types of reactions that are plausible in response to
those patterns. Thus, a grandmaster should be able to recognize more
plausible moves in a situation than would a lower-rated player. In contrast,
calculational capacity consists of the ability to work out the implications
of actions. Continuing with our chess example, once patterns have been
recognized and plausible moves identified, analysis is needed in order
to select the optimal move. This analysis would consist of the examination
of moves, counter-moves, etc. Above a certain level of proficiency, we would
expect these calculational capacities to remain constant. The only
difference to expect is that grandmasters may have larger "chunks" to deal
with, thus freeing more of their working memory space for deeper analyses.
Glaser (1981) has discussed the importance of recognitional capacities over
calculational capacities in accounting for proficient performance.

4.2 Perceptual Learning. Our second hypothesis is that qualitative
differences in task perception influence success. Gibson (1969) has
discussed the importance of perceptual learning for skills, and it seems
obvious that experts can make distinctions that are opaque to novices.
However, the perceptual learning obtained through experience may not simply
result in smaller jnd's (just noticeable differences). We speculate that
the expert has also learned which dimensions to use in examining tasks. That
is, the learning necessary for proficiency is based on the acquisition of
more powerful perceptual dimensions, as well as the ability to make finer
discriminations along these dimensions.

These discriminations may reflect a greater ability to differentiate
between appropriate goals. Sensitivity to overall goals leads to coordinated
performance, as opposed to the jerky movements of novices reacting primarily
to immediate demands.

4.3 Analogical Reasoning. A third hypothesis is that specific previous
experiences can be used as analogues to a given problem situation, acting
as an efficient means of bringing a large amount of information to bear
on the problem. Someone with more experience will have available a wider
range of analogues, and will be likely to have available an analogue that
is directly relevant to a specific task. In addition, a person with more
experience will be able to make better use of analogues to define problems,
generate options, anticipate outcomes, and formulate predictions.

However, it is difficult to test these hypotheses without a
comprehensive theory of analogical reasoning. Accordingly, much of our
effort in this domain has been in the direction of developing descriptive
and prescriptive theories of analogical reasoning. The logic of analogical
reasoning may govern the use of schemas and prototypes; it may serve as
the basis for the process of generating new hypotheses. The study of analogical
reasoning may have applied value if it can help us to understand how new
situations are understood, and how predictions are arrived at.

4.4 Summary. We are assuming that expertise develops through perceptual
learning rather than just through the acquisition of rules and procedures.
Experts are not just faster and more accurate at applying rules and higher
level rules. Rather, their skill is based on the fact that they have learned
to perceive situations differently. They can perceive larger chunks, and they
can recognize overall situations and relationships. They can make
discriminations that are opaque to personnel with less experience, and they can detect
similarities that go unnoticed by personnel at lower skill levels. They
have acquired a wide range of applicable experiences. Perhaps most important
of all, experts appear to be able to recognize plausible goals in situations.
They can examine a situation and quickly understand what sorts of outcomes
are worth striving for. These goals appear to be recognized without the need
for calculations. An expert simply seems to be able to recognize what
outcomes to expect, and what goals to emphasize. To some extent, this
recognition may be based on perceived similarity of the current situation
to other analogous situations, or to prototypes derived from several
analogues. Once the expert has identified long-range goals, these can be
used to structure short-range goals and plans. Thus, the performance of
the expert appears smooth and coordinated because actions are generally
occurring within a context of overall goals. In contrast, novices are usually
reacting to local conditions, and trying to respond to immediate pressures.
There are no long-range goals to integrate their performance.

4.5 Research Findings. We have studied recognitional capacities,
perceptual learning, and analogical reasoning.

4.5.1  Recognitional  Capacity. 

Our  prediction  was  that  proficiency  at  a  task  depends  on  recognitional 
rather  than  calculational  capacities.  People  who  are  more  proficient  at 
a  task  appear  to  be  able  to  recognize  better  options  and  reactions,  and  this 
is  a  reason  for  their  performance  superiority.  This  prediction  was  tested 
in  a  study  of  chess  expertise  performed  in  collaboration  with  Stuart  and 
Bert  Dreyfus,  University  of  California  at  Berkeley  (who  had  been  funded  by 
the Air Force Office of Scientific Research Grant AFOSR-78-3594).

4.5.1.1 Subjects. The subjects were three chess players rated as senior
masters, and three players rated as experts. The U.S. Chess Federation
rates players on the basis of outcomes of games and tournaments. A
difference of 200 rating points translates into a 75% probability of winning
a game. The median of American tournament players is estimated at 1400.
Class E players (beginners) are rated less than 1200. Class D players
are rated 1200 - 1399. Class C players are rated 1400 - 1599. Class B
players are reasonably strong, and rated 1600 - 1799. Class A players are
very strong, and are rated 1800 - 1999. The next step above Class A is the
rating of expert, 2000 - 2199. Masters are rated 2200 - 2399. Senior masters
are rated above 2400. International grandmasters are rated above 2500, and in
addition have shown a certain level of proficiency while playing in certified
tournaments.
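The relationship between a rating difference and a winning probability can be illustrated with a short sketch. It assumes the commonly published logistic Elo expectation formula with a 400-point scale, which is not stated in this report; the function names are hypothetical. The class boundaries are the ones listed above.

```python
def expected_score(rating_diff: float) -> float:
    """Expected score of the higher-rated player.

    Assumption: the standard logistic Elo formula with a 400-point scale.
    The report itself only states the 200-point / 75% rule of thumb.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


# (minimum rating, label), taken from the class boundaries listed in the text.
# The report says senior masters are rated "above 2400"; treating exactly
# 2400 as a senior master is an assumption made here for simplicity.
CLASS_FLOORS = [
    (2400, "Senior master"),
    (2200, "Master"),
    (2000, "Expert"),
    (1800, "Class A"),
    (1600, "Class B"),
    (1400, "Class C"),
    (1200, "Class D"),
]


def rating_class(rating: int) -> str:
    """Map a numeric rating to the class label used in the report."""
    for floor, label in CLASS_FLOORS:
        if rating >= floor:
            return label
    return "Class E"


# A 200-point advantage gives an expected score of roughly 0.76,
# close to the 75% figure quoted above.
print(round(expected_score(200), 2))
print(rating_class(2062), rating_class(2401))
```

Under this assumed formula, the gap between the experts and masters in the study (roughly 250 to 440 points) corresponds to a large expected advantage for the masters in head-to-head play.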

In this study, the three experts were rated 2062, 2130, and 2150.
The three masters were rated 2401, 2403, and 2500. In addition, we used two
players to rate the moves of the games played by the masters and experts.
One of these players was rated 2500+ (this person was a senior master,
lacking only tournament credentials to be considered an international
grandmaster. At the time of the study, he had tied for first place in the
U.S. Chess Championships held at Stanford in the summer of 1981). The other
rater was rated at 2520 (international grandmaster).

4.5.1.2 Procedure. Two tournaments were arranged, one for
the experts and one for the masters. Each tournament consisted of a double
round-robin, in which each player played each other player two times, once
with the black pieces and once with the white pieces. Regulation time
of 50 moves in 2 hours was used. In addition, another double round-robin
was played by the same players with only five minutes total available for
each player. Thus, condition A was playing skill, master vs. expert, and
condition B was time available, regulation time or a speeded condition.
For each set of players, the sequence of speeded and regulation games
was counterbalanced for each session. Each player played white and black
an equal number of times against each other player, and began a set of
games with an opponent playing white and black an equal number of times.
An equal number of sessions began with the 5-minute game first and the
regulation time game first. This design yielded 6 regulation time games
and 6 speeded games, at each of the two skill levels.
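The game counts in this design follow directly from the double round-robin structure. The sketch below enumerates the pairings for three players and alternates which time control opens each session. The player labels are hypothetical, and the idea that a session is one speeded plus one regulation game with the same colors is an assumption made for illustration, not a detail confirmed by the report.

```python
from itertools import permutations


def double_round_robin(players):
    """Return every ordered (white, black) pairing, so each pair meets twice."""
    return list(permutations(players, 2))


def build_sessions(players):
    """Attach a counterbalanced time-control order to each pairing.

    Assumption: one session = one speeded and one regulation game with the
    same colors, and the opening time control simply alternates.
    """
    sessions = []
    for index, (white, black) in enumerate(double_round_robin(players)):
        if index % 2 == 0:
            order = ("speeded", "regulation")
        else:
            order = ("regulation", "speeded")
        sessions.append({"white": white, "black": black, "order": order})
    return sessions


masters = ["M1", "M2", "M3"]  # hypothetical player labels
sessions = build_sessions(masters)

regulation_games = sum(1 for s in sessions if "regulation" in s["order"])
speeded_games = sum(1 for s in sessions if "speeded" in s["order"])
speeded_first = sum(1 for s in sessions if s["order"][0] == "speeded")

print(regulation_games, speeded_games, speeded_first)
```

With three players there are 3 x 2 = 6 ordered pairings, which gives the 6 regulation and 6 speeded games per skill level reported above, with half of the sessions opening with the speeded game.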


Incentives  were  provided  for  performance.  For  the  experts,  each 
player  was  paid  $2.50  for  each  game  played,  and  an  additional  $10  for  each 
game won, whether regulation time or speeded. For the masters, there was a
$10 payment for playing each game, and an additional $30 for each game won;
in  addition,  the  results  of  each  regulation  game  were  presented  to  the  U.S. 
Chess  Federation,  to  be  taken  into  account  in  revising  ratings. 

For  each  game  played,  records  were  made  of  the  moves  played.  For  the 
regulation  games  this  was  done  during  the  games.  For  the  speeded  games 
the  moves  were  reconstructed  immediately  after  the  game.  A  research 
assistant  was  present  at  all  the  games  played  by  masters,  and  recorded 
moves during the speeded games. In addition, times were recorded along
with moves during the games for the six regulation games played by masters.

After the games were played, sheets were prepared for the raters.
Each game was coded, so there would be no indication of whether the game
was regulation or speeded, played by experts or masters.

For  each  game  played,  moves  1-10  were  deleted  (since  we  weren't 
interested  in  studying  knowledge  of  book  openings),  and  a  diagram  was 
prepared with the position after move 10. Two chess players with ratings
above 2500 were paid to rate each of the moves. The initial ratings were
performed  independently,  but  subsequent  consultation  was  allowed  to  permit 
the  sharing  of  discoveries  about  strengths  and  weaknesses  of  specific  moves. 

Each  move  was  rated  on  two  scales.  First,  the  rater  assessed  the 
position  prior  to  the  move,  and  determined  whether  there  was  clearly  one 
best  move  in  that  situation,  or  whether  there  were  at  least  2-3  moves  to 
consider.  The  rationale  for  this  rating  was  that  the  skill  of  the  masters 
should  be  more  evident  in  more  complex  situations.  Second,  the  move 
selected was rated on a 5-point scale. The anchors for the scale are as
follows: 5 (there are no moves better than this one), 4 (playable, but
not the best move), 3 (dubious, not a strong move, but not a blunder),
2 (a positional blunder, threatening the loss of material or an attack on
the King), and 1 (a material blunder, leading to the outright loss of a
piece). Ratings took approximately one hour/game.

4.5.1.3 Results. Agreement between raters was high. For
the first scale (clearly one best move in a situation vs. at least 2-3
alternatives) there was 92% agreement. For the second scale, quality of
moves, the correlation between the ratings for the two raters was approximately
.84. The data were combined for the two raters by averaging the ratings given
to each move. For the judgment of "clearly one best move" vs. "at least 2-3
alternatives," we only used the cases where both raters were in agreement.
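The agreement measures and the combination rule described above can be sketched as follows. The ratings in the example are made up for illustration and are not data from the study; the function names are hypothetical, and the Pearson coefficient is assumed as the correlation measure since the report does not name one.

```python
from math import sqrt


def percent_agreement(labels_a, labels_b):
    """Fraction of moves on which both raters gave the same situation label."""
    matches = sum(1 for a, b in zip(labels_a, labels_b) if a == b)
    return matches / len(labels_a)


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of ratings."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def combine(quality_a, quality_b, labels_a, labels_b):
    """Average the two quality ratings per move; keep the situation label
    only where the raters agree (None marks a move dropped from the
    one-best-move vs. 2-3 alternatives analysis)."""
    combined = []
    for qa, qb, la, lb in zip(quality_a, quality_b, labels_a, labels_b):
        label = la if la == lb else None
        combined.append(((qa + qb) / 2, label))
    return combined


# Hypothetical ratings for six moves (5-point quality scale; "C" = clearly
# one best move, "2-3" = at least 2-3 alternatives).
quality_a = [5, 4, 5, 3, 4, 2]
quality_b = [5, 4, 4, 3, 5, 2]
labels_a = ["C", "2-3", "C", "C", "2-3", "C"]
labels_b = ["C", "2-3", "C", "2-3", "2-3", "C"]

agreement = percent_agreement(labels_a, labels_b)
correlation = pearson_r(quality_a, quality_b)
combined = combine(quality_a, quality_b, labels_a, labels_b)
```

Keeping the averaged quality rating for every move while dropping disputed situation labels mirrors the report's choice to use all quality ratings but only the agreed-upon situation judgments.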

The average game contained 40 moves by each player, of which we were
able to use 99%. The overall data are presented in Figure 1. The four data
points in Figure 1 each represent between 324 and 474 ratings. The rated quality of
moves is higher for the masters than the experts, by .14. This difference was
significant, F(1, 22) = 5.0, p < .05. The difference may appear small, but it
should be remembered that this is the average difference per move. Projected
over a series of 7 moves, it would result in a master making one move rated
as "5" while an expert was making a playable, but not highest quality move,
rated "4." Projected over a game of 30-60 moves, the difference would be
sufficient for the master to generally win games if matched with an expert.
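The projection above is simple arithmetic on the mean per-move difference. The sketch below makes it explicit; it assumes the difference accumulates linearly across moves, which is the simplification implied by the text, and the function name is hypothetical.

```python
MEAN_DIFF_PER_MOVE = 0.14  # master minus expert, from the results above


def cumulative_advantage(n_moves, diff=MEAN_DIFF_PER_MOVE):
    """Total rating-scale advantage after n_moves, assuming the mean
    per-move difference simply adds up (a simplifying assumption)."""
    return n_moves * diff


# Over 7 moves the accumulated difference is about one full scale point,
# i.e. one move rated "5" instead of "4".
for n in (7, 30, 60):
    print(n, round(cumulative_advantage(n), 2))
```

On this reading, a 30- to 60-move game accumulates roughly four to eight scale points of advantage for the master.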


The average time/move in regulation games was 2.5 minutes, and for
speeded games it was approximately 6.0 seconds. Figure 1 also shows that the
move quality was higher for regulation games than for speeded games,
F(1, 22) = 17.1, p < .05. This supports a calculational model, in which
the players are constructing sequences of moves, counter-moves,
counter-counter-moves, and so on. The more time available to perform such
analyses, the deeper and more careful the analysis can be. Both the master
and expert show improvement from speeded to regulation time games.

Figure 1 supports a mixed model, including recognitional and
calculational abilities. If the superiority of the masters were due solely
to calculational skills, then this superiority would be more strongly
demonstrated for the regulation games than the speeded games. However, the
trends for masters and experts are parallel. They both show the same
improvement with time. These data do not support a model claiming that the
superiority of the master is due to better tree construction and searching.

The data do support the type of recognitional model discussed by
Simon. If the chess master can recognize better moves to analyze, we would
expect this difference to emerge for the 5-minute games, as it does. The
data are consistent with a model of chess decision-making in which a finite,
limited number of moves are recognized, and are then analyzed. The master
can recognize higher quality moves, but is not superior to the experts at
analyzing the moves recognized. Thus, the difference that appears for
5-minute games remains constant for regulation games.
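The recognize-then-analyze account can be illustrated with a toy model. Everything in the sketch is hypothetical: the candidate qualities, the size of the masters' shift, and the probabilities that analysis finds the best recognized candidate are invented numbers, chosen only to show why equal analytic skill combined with better recognized candidates produces parallel trends.

```python
def expected_move_quality(candidates, p_best):
    """Expected quality of the move played.

    Toy assumption: with probability p_best the player's analysis identifies
    the best recognized candidate; otherwise the player ends up with a
    candidate of average quality.
    """
    best = max(candidates)
    average = sum(candidates) / len(candidates)
    return p_best * best + (1 - p_best) * average


# Hypothetical candidate-move qualities on the 5-point scale.
EXPERT_CANDIDATES = [3.0, 3.5, 4.0]
RECOGNITION_SHIFT = 0.5  # masters recognize uniformly better candidates
MASTER_CANDIDATES = [q + RECOGNITION_SHIFT for q in EXPERT_CANDIDATES]

# Hypothetical analysis success rates, shared by both groups: more time
# helps both equally, reflecting equal calculational skill.
P_BEST = {"speeded": 0.3, "regulation": 0.8}


def master_advantage(condition):
    """Master minus expert expected quality under one time condition."""
    p = P_BEST[condition]
    return (expected_move_quality(MASTER_CANDIDATES, p)
            - expected_move_quality(EXPERT_CANDIDATES, p))


for condition in ("speeded", "regulation"):
    print(condition, round(master_advantage(condition), 3))
```

In this toy setting both groups improve with time, but the masters' advantage is the same in both conditions because it comes entirely from the candidate set. A model in which masters had a higher analysis success rate would instead predict a gap that widens with time.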

These data also support de Groot's (1978) observation that grandmasters
could recognize and select the best move in a difficult chess
problem, whereas experts rarely even considered the move as an option.

The  data  were  further  analyzed  into  situations  where  there  was 
clearly  one  best  move  (C),  vs.  situations  where  there  were  at  least  2-3 
good  moves  (2-3).* 

* The validity of this distinction is supported by the time data for
regulation games of masters. The mean time taken for situations where there
was clearly one best move was 1.68 minutes, whereas if there were at least
2-3 alternatives, the time taken was 3.99 minutes. The difference was
significant, F(1, 23) = 35.37, p < .01.


Figure 1: Move Quality for Masters vs. Experts
for Regulation and Speeded Games

These data are presented in Figures 2 and 3. Several findings emerge
from examination of these figures. First, the parallel trends for masters
and experts become even more pronounced. In Figure 2, the difference in
move quality between masters and experts is .08 for regulation games, and
.10 for speeded games. In Figure 3, the difference between masters and
experts in move quality is .31 for regulation games and .26 for speeded
games. None of these data can be interpreted as showing that the masters are
superior at calculations or analysis, compared to experts.
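The parallel-trends argument amounts to checking how much the master-expert gap changes when more time is available. The sketch below computes that change from the four differences reported above; the variable and function names are hypothetical, and no significance test is implied.

```python
# Master-minus-expert differences in move quality, as reported in the text.
GAPS = {
    "one best move (Figure 2)": {"regulation": 0.08, "speeded": 0.10},
    "2-3 alternatives (Figure 3)": {"regulation": 0.31, "speeded": 0.26},
}


def gap_change_with_time(gap):
    """Change in the masters' advantage from speeded to regulation games.

    A purely calculational account of mastery would predict a clearly
    positive value, since extra time would favor the better calculator.
    """
    return round(gap["regulation"] - gap["speeded"], 2)


for situation, gap in GAPS.items():
    print(situation, gap_change_with_time(gap))
```

The changes are -0.02 and 0.05 scale points, small relative to the gaps themselves, which is what the text means by parallel trends.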

A second finding is that the difference between masters and experts
is more pronounced when there is not any clearly best move. The slopes of
the lines in Figure 3 are also flatter than in Figure 2.

Since masters show a greater superiority to experts in complex
situations (2-3 alternatives) than in simple situations (one best move), we
would expect that this would affect the strategies used by each type of
player. Figure 4 shows that under speeded conditions, both masters and
experts show the same proportion of cases for which there are 2-3 options:
42% of the total moves. Given enough time to shape strategy, masters reduce
their proportion to 36%. However, the experts reduce the proportion to only
18%. The experts seem to be trying to maximize the role of their calculational
skills, and minimize their reliance on recognitional skills. They are
simplifying their games. It would be interesting to compare these
proportions for games in which masters and experts faced each other.

Figure 2: Move Quality for Situations Where
There is Clearly One Best Move

Figure 4 also explains why the trends in Figure 1 are not as parallel
as those in Figures 2 and 3. The experts were playing more simplified
regulation time games than the masters (in more than 80% of the cases there
was one clearly best move), and the average rating for such moves was higher
than for situations with 2-3 alternatives. Therefore, the data for expert
moves in regulation games are artificially inflated in Figure 1. Figures
2 and 3 maintain the distinction between the C condition and the 2-3
condition, and are a more accurate reflection of performance.
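The inflation described above is a mixture effect: the pooled mean in Figure 1 is a weighted average of the two situation types, and the experts' regulation games contain a larger share of the easier situations. The sketch below uses the proportions reported for Figure 4 together with made-up cell means, which are labeled as hypothetical, to show the effect in isolation.

```python
def pooled_mean(share_complex, mean_simple, mean_complex):
    """Overall mean move quality as a weighted average of the
    one-best-move cases and the 2-3 alternatives cases."""
    return (1 - share_complex) * mean_simple + share_complex * mean_complex


# Share of moves with 2-3 alternatives in regulation games (from Figure 4).
SHARE_COMPLEX_REGULATION = {"masters": 0.36, "experts": 0.18}

# Hypothetical cell means, deliberately identical for both groups so that
# any difference in the pooled mean comes from the situation mix alone.
MEAN_SIMPLE = 4.6   # one clearly best move (assumed value)
MEAN_COMPLEX = 4.0  # 2-3 alternatives (assumed value)

pooled = {
    group: pooled_mean(share, MEAN_SIMPLE, MEAN_COMPLEX)
    for group, share in SHARE_COMPLEX_REGULATION.items()
}

for group, value in pooled.items():
    print(group, round(value, 3))
```

Even with identical per-situation quality, the experts' pooled regulation mean comes out higher than the masters' in this example, which is why the separate Figures 2 and 3 give the cleaner comparison.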

The data should not be interpreted to mean that skill level is
never related to calculational skills. We only examined two neighboring
classes of skills, both at very high levels. We would expect that at lower
skill levels, calculational capacities would emerge as a differentiating
factor.*

4.5.1.4 Implications. The data support the importance of
recognitional capacities for highly proficient performance. This suggests
that it may be more fruitful to study high levels of proficiency in terms of
perceptual learning models than in terms of tree-searching, calculational
models.

The data have implications for training. The training of recognitional
capacities needs to be examined if we are to be able to use training programs
to develop high levels of expertise. Simon and Chase, and Larkin et al.
claim that such capacities are developed only after thousands of hours of
practice. Highly proficient subjects appear to be able to distinguish between
50,000 different patterns. The pattern vocabulary for good club chess players
is only about 1,000 patterns. Novices can only recognize a few patterns.
One challenge is to be able to expand the recognitional pattern vocabulary
more efficiently. In chess this might be done for beginners by developing
training materials consisting of games played by high level players.
The task would be to predict which moves in each situation were considered
by the players; these predictions would be matched against an actual
listing of the moves that the players did consider. Such training could
facilitate the ability to recognize the types of moves to be examined.

* However, it is difficult to obtain ratings for moves at low levels of play,
for several reasons. The ratings would be biased, since it would be clear
to raters that skill levels were markedly different. In addition, it would be
difficult to rate moves, since a given move might be a blunder committed
by a Class C player, or a clever tactic played by an expert. The rater would
have to see how the move was followed up in order to determine how much
strength to read into it, and this would complicate the ratings.

The data also have implications for the design of decision aids.
Two inferences are made. The first is that in order to help experts play
like masters, they will need to have a better set of initial moves to consider.
Second, since both masters and experts are relying on calculational skills, a
decision aid that allowed the player to enter initial moves (and
countermoves) and then performed the subsequent tree construction and search
might be of benefit. The interaction would consist of the operator continually
pruning the decision tree by rejecting poor lines of play, and emphasizing
promising lines for deeper analysis.
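The interaction described above can be illustrated with a minimal sketch. Everything in it is hypothetical: the toy game tree, the scores, and the names `TREE`, `LEAF_SCORE`, `search`, and `advise` are assumptions made for illustration, not part of any aid built or tested in this research. The operator supplies the candidate first moves and a `keep` rule that rejects poor lines; the program only constructs and searches the remaining tree.

```python
from typing import Callable, Dict, List

# Hypothetical toy game tree: each position maps to {move: next_position}.
TREE: Dict[str, Dict[str, str]] = {
    "root": {"a": "A", "b": "B", "c": "C"},
    "A": {"a1": "A1", "a2": "A2"},
    "B": {"b1": "B1", "b2": "B2"},
    "C": {"c1": "C1"},
}
# Made-up static scores for leaf positions, from the first player's viewpoint.
LEAF_SCORE: Dict[str, float] = {"A1": 3, "A2": 5, "B1": 6, "B2": 9, "C1": 1}

Keep = Callable[[str, str], bool]


def search(pos: str, maximizing: bool, keep: Keep) -> float:
    """Minimax over TREE, skipping any line the operator rejects via keep().

    Assumes keep() leaves at least one continuation at every position.
    """
    if pos in LEAF_SCORE:
        return LEAF_SCORE[pos]
    values = [search(nxt, not maximizing, keep)
              for move, nxt in TREE[pos].items() if keep(pos, move)]
    return max(values) if maximizing else min(values)


def advise(candidates: List[str], keep: Keep = lambda pos, move: True) -> str:
    """Score only the operator's candidate first moves; return the best one."""
    scored = {m: search(TREE["root"][m], False, keep) for m in candidates}
    return max(scored, key=scored.get)


# The operator proposes moves "a" and "b"; the aid searches the replies.
best = advise(["a", "b"])
```

In this sketch the quality of the advice is bounded by the candidate list the operator enters, which mirrors the first inference above: the search helps with calculation but does not supply the initial moves.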

Finally, the results have implications for workload assessment.
When the average time per move is reduced from 2.5 minutes to 6 seconds, the
quality of move is reduced by only a small amount, and is still reasonably
high. For the speeded games, the average move generated by experts was still
rated above the level of playable. This suggests that for tasks that
involve recognitional capacities, and are performed by proficient personnel,
time pressure may not have an overwhelming effect on performance. It must
be remembered that this holds, in the present experiment, for players rated
as experts, who are far inferior to grandmasters.

In fact, it could be argued that the skill of the players studied
is primarily recognitional, rather than calculational. This argument, which
depends on several tenuous assumptions, runs as follows: if we assume that
recognitional capacities are manifested within the first few seconds, and
not thereafter (which is unlikely; evaluation of moves depends on recognition),
then all of the improvement between the speeded and the regulation games is
due to calculations. This is not very much improvement. Further, if we
assume that only a minimal amount of calculation can occur within 6 seconds,
then we might conclude that most of the skill depends on recognition,
because very high quality moves were being generated within a very short time.
Proponents of a calculational model would have to show that moderately good
chess players are able to perform the analyses necessary to generate playable
moves in only a few seconds. The burden of proof is on such proponents.

Of course, it must be noted that the paradigm we used did not control
time available. For speeded games, the average time available was approximately
6-7 seconds. Subjects were undoubtedly using more time for more complex
situations. They were also using analyses developed during prior moves, plus
analyses performed during the opponent's turn. A design that provided better
controls on time would be a next step for this research.

4.5.2 Perceptual Learning. The primary method we have developed
for contrasting the perceptual abilities of experts vs. novices is based on
similarity and difference judgements (Galanter, 1956; Fransella & Bannister,
1977). The paradigm has three stages: (a) the selection of representative
examples; (b) the elicitation of similarity/difference judgements, using
those materials, to identify dimensions of analysis; (c) presentation of
rating scales to subjects at different levels of competence, to identify
commonalities in the use of some dimensions, and to highlight dimensions
that are used differentially. This paradigm allows us to determine perceptual
differences between an expert and a novice. It allows us to measure processes
like the perceptual learning discussed by Eleanor Gibson (1969) for finding
the relevant information in a situation. We think that experts use different
discriminative dimensions than novices. This prediction was tested in a study
that compared Cardio-Pulmonary Resuscitation (CPR) performance of students,
CPR instructors, and paramedics.

4.5.2.1 Method

4.5.2.1.1 Subjects. Three groups of subjects were used in this
study: students, instructors, and paramedics. The students were adults
who had taken a CPR training program (eight hours) and had received
certification for successful completion. The instructors had completed the
CPR training program as well as an additional training program for instructors.
Each instructor had received instructor certification at completion, and
had taught CPR to novices. No instructor in this study had ever actually
performed CPR on a victim. The paramedics were trained in CPR and had
experience using CPR with victims as part of their work. None of the
paramedics in this study had been involved in presenting CPR instruction.

4.5.2.1.2 Materials. The study used videotapes and test booklets.
There were six different videotapes, each showing a person (exemplar) doing
cardio-pulmonary checks and performing CPR on a Resusci Anne training
simulator.

The test booklet presented the judgement dimensions that had been
derived during an earlier study of CPR expertise, and it provided a place
for the subjects to enter a rating for each videotaped performance for each
of 13 dimensions. The subjects also indicated which of the six people they
would choose to save their own life in an emergency.

4.5.2.1.3 Procedure. Each subject saw videotape presentations of
the six exemplars. The task was to rate each of the six performances
on each of the 13 dimensions. For each dimension, both ends of a scale were
described, e.g., "Smooth" and "Jerky." One end of the scale was to be
described as "1" and the other as "5." Each videotaped performance was
to be rated between "1" and "5" for each of the dimensions.

4.5.2.2 Results

How did each of the three groups judge exemplar skill?

All subjects were asked which exemplar they would choose to save
their own life in an emergency. (The six tapes showed five students and one
paramedic.) In the paramedic group, 9/10 subjects selected the CPR
performance of the paramedic exemplar. The paramedic exemplar was selected by
5/10 students, and by only 3/10 instructors. The instructors were concerned
that the paramedic wasn't following the procedures they taught in their
courses.

The  judgement  pattern  on  individual  dimensions  was  consistent  with 
this  finding.  The  paramedic  group  judged  the  performance  of  the  paramedic 
exemplar  highest  on  12  of  the  13  dimensions  ("hand  placement"  was  the  only 
exception).  The  student  group  judged  the  paramedic  exemplar  highest  on  only 
6 dimensions: "smoothness," "compressions simulate heart action," "efficient,"
"compression  time  correct,"  "confident,"  and  "performance  reflects  an 
understanding  of  how  the  body  works."  The  instructor  group  judged  the  paramedic 
exemplar  highest  on  only  4  dimensions:  "smooth,"  "adequate  breath  check," 
"correct  pulse  assessment,"  and  "performance  reflects  an  understanding  of  how 
the  body  works." 

How  do  the  groups  differ  in  their  use  of  dimensions? 

In general, the paramedics were able to use all 13 dimensions
to discriminate between the performance of the exemplars, whereas the
instructors and novices had difficulties in using several of the dimensions.

Differences between groups taken two at a time were examined for
each of the 13 dimensions. Table 1 presents p levels of significant
single discriminant functions. For student vs. paramedic judgements,
the single discriminant functions were significant for six dimensions.
Students and instructors could be significantly distinguished with the single
discriminant functions for ten of the dimensions. Finally, the judgements
of the instructor and paramedic groups could be distinguished on nine
dimensions. As can be seen in Table 1, there were significantly different
patterns of judgements between all three groups for five of the 13 dimensions.

Differences  were  found  between  groups  in  the  use  of  specific 
dimensions.  An  example  of  a  large  difference  in  the  use  of  a  dimension  is 
shown  in  Figure  5.  Figure  5  presents  the  performance  patterns  for  instructors 
and  paramedics,  for  the  dimension  of  "efficiency."  The  mean  rating  for  each 
exemplar  is  given,  along  with  the  band  size  of  one  standard  deviation. 

Exemplar  A  is  the  videotape  for  the  paramedic;  the  other  five  exemplars 
are  videotapes  of  students  performing  CPR.  Figure  5  shows  how  the  paramedics 
could  use  this  dimension  to  distinguish  the  paramedic  from  the  students, 
whereas  the  instructors  were  primarily  distinguishing  between  students. 
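The kind of summary plotted in Figure 5 (a mean rating per exemplar with a band of one standard deviation) can be sketched as follows. The ratings below are made up purely for illustration and are not the study data; the function names `bands` and `separated` are hypothetical, and the use of the sample standard deviation is an assumption, since the report does not say which form was used.

```python
from statistics import mean, stdev

# Made-up 1-5 "efficiency" ratings for illustration only; NOT the study data.
# Exemplar "A" stands for the paramedic tape, "B" for one student tape.
ratings = {
    "paramedics": {"A": [5, 5, 4, 5], "B": [2, 3, 2, 2]},
    "instructors": {"A": [3, 2, 4, 3], "B": [4, 2, 3, 5]},
}


def bands(group_ratings):
    """Return {exemplar: (mean - sd, mean, mean + sd)} for one rater group."""
    out = {}
    for exemplar, scores in group_ratings.items():
        m, s = mean(scores), stdev(scores)  # sample SD (an assumption)
        out[exemplar] = (m - s, m, m + s)
    return out


def separated(band_a, band_b):
    """True when two one-standard-deviation bands do not overlap."""
    return band_a[2] < band_b[0] or band_b[2] < band_a[0]


para = bands(ratings["paramedics"])
inst = bands(ratings["instructors"])
print(separated(para["A"], para["B"]), separated(inst["A"], inst["B"]))
```

With these invented numbers the paramedic raters' bands for the two exemplars do not overlap while the instructor raters' bands do, which is the qualitative pattern the figure is described as showing.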

4.5.2.3 Discussion

The results support the hypothesis that personnel at different
skill levels will show differential use of the same dimensions in perceiving
performance of a task. The paradigm that was used is capable of showing which
dimensions were consistently used by which group of subjects, as well as
identifying cases in which two groups of subjects were both using a
dimension, but in different ways.


TABLE 1

Significance levels for single discriminant functions,
for the three pairs of groups

DIMENSION                                Student/    Instructor/   Instructor/
                                         Paramedic   Student       Paramedic

 1. Smooth - jerky
 2. Compressions simulate heart action - compressions fail to
    simulate heart action
 3. Compressions timing correct - timing incorrect
 4. Dangerous - effective
 5. Checks cues (monitor) - fails to check cues
 6. Body position over victim - body position at right angle
 7. Hand placement acceptable - hand placement unacceptable
 8. Adequate breath check - inadequate breath check
 9. Correct pulse assessment - incorrect pulse assessment
10. Efficient - inefficient
11. Confused - confident
12. Compression depth correct - compression depth incorrect
13. Performance reflects understanding of how body works -
    performance reflects ignorance of how body works
Figure 5: Instructors vs. paramedics in the
rating of efficiency

The research paradigm can be used to make decisions about training
requirements. This is especially important at higher skill levels, where
proficiency depends on the way people have learned to perceive situations,
rather than on rules and procedures. The results of Table 1 can be used
to identify training requirements for instructors, to help them learn to
perceive the CPR task more like the paramedics do. For nine of the thirteen
dimensions, there was a significant difference in the way instructors and
paramedics perceived the exemplar performance. Three of these differences
were significant at the .01 level.

The general strategy for identifying training requirements is as
follows: for any given dimension, are the more skilled subjects using
that dimension? If not, then it does not require training. If so, then
we must see if the less skilled subjects are also using the dimension. If
not, then it is a training requirement. If they are, but not in the same
way as the more highly skilled subjects, then it is also a training
requirement, and the differences can be called to their attention as a training
method.
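The strategy above is a short decision procedure, and it can be written out as a minimal sketch. The names `Need` and `training_need` and the three boolean inputs are hypothetical; how "using a dimension" is established (for example, from the discriminant functions) is outside the sketch. The final case, where both groups use the dimension in the same way, is not stated explicitly in the text and is assumed here to carry no training requirement.

```python
from enum import Enum


class Need(Enum):
    NONE = "no training requirement"
    TEACH_DIMENSION = "train the less skilled group to use the dimension"
    ALIGN_USAGE = "point out how usage differs from the more skilled group"


def training_need(skilled_use: bool, less_skilled_use: bool,
                  same_way: bool) -> Need:
    """Apply the strategy to one dimension."""
    if not skilled_use:
        return Need.NONE             # the dimension does not require training
    if not less_skilled_use:
        return Need.TEACH_DIMENSION  # used only by the more skilled group
    if not same_way:
        return Need.ALIGN_USAGE      # used by both groups, but differently
    return Need.NONE                 # assumed: same usage, nothing to train


# Example: both groups use the dimension, but in different ways.
print(training_need(True, True, False).value)
```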

In addition to identifying training requirements for more proficient
personnel, the analysis of perceptual dimensions may also have some value
in evaluating training progress, and in supporting personnel selection
decisions. Applicants can be matched to existing group profiles, to see
which group is most closely matched by their perceptual judgements.

The research paradigm appears to have general applicability to
domains in which there is a contrast between novice and proficient performance.
It is currently being used to study computer programming skills (Peio, 1981).
Computer software experts (more than seven years' experience) were
contrasted with novices who had completed high levels of training or had
one to three years' experience in the field. As in the CPR study, relevant
task dimensions were elicited and subjects ranked task exemplars on the
basis of those dimensions.

The task selected was the evaluation of algorithms for a particular
programming task. A variety of algorithms for a classical computer programming
task, that of the shortest path problem (critical path analysis), were chosen
for the study. These algorithms differed in a variety of ways, and using the
matching technique ten dimensions were elicited. Subjects were then asked
to examine these exemplars and rank them on the basis of how the dimensions
apply to each one.

A discriminant analysis again revealed significant patterns of
differences between groups of experts and novices in the use of perceptual
dimensions. Novice and expert groups were successfully separated
on seven of the ten dimensions, p < .05. Furthermore, the discriminant analysis
revealed which individuals were correctly classified as experts or
novices solely on the basis of scores on those dimensions. These predictions
ranged from 75% to 95% correct on the seven significant dimensions. In
four of the most significant dimensions (over 85% correct classifications)
the same novice programmer accounted for 5% of incorrect classifications
of group members. It was later found that this novice was experienced with
this particular programming task. An additional finding was that experts
were able to use more of the dimensions than the novices in discriminating
between algorithms, providing further support for the usefulness of this
paradigm in determining training requirements and in the identification of
proficient personnel. (See Table 2.)


TABLE 2

Group Significance of Dimensions

Dimensions                             Significance    Percent Correctly
                                                       Classified

Independence of Computer Strengths      ** 0.008           95.0%
Readability                             ** 0.001           90.0%
Writability                             ** 0.005           89.5%
Subpaths                                ** 0.002           75.0%
Storage                                  * 0.015           85.0%
Language                                 * 0.036           80.0%
Execution Time                             0.070           70.0%
Validation                               * 0.047           80.0%
Nodes                                      0.077           65.0%
Calculates Paths                           0.165           70.0%

 * p < .05
** p < .01
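A "percent correctly classified" figure of the kind reported in Table 2 can be illustrated with a minimal sketch for a single dimension. With one predictor, two groups, equal priors, and a pooled variance, a linear discriminant reduces to a cut point midway between the two group means; that simplification is an assumption here, since the report does not describe the computation used. The scores are made up for illustration and `percent_correct` is a hypothetical name.

```python
from statistics import mean


def percent_correct(experts, novices):
    """Classify each score by the midpoint between the two group means."""
    m_e, m_n = mean(experts), mean(novices)
    cut = (m_e + m_n) / 2
    experts_score_high = m_e > m_n

    def is_expert(x):
        # A score falls on the expert side of the cut point or it does not.
        return (x > cut) == experts_score_high

    hits = (sum(is_expert(x) for x in experts)
            + sum(not is_expert(x) for x in novices))
    return 100.0 * hits / (len(experts) + len(novices))


# Made-up rankings on one dimension for ten experts and ten novices.
experts = [1, 2, 2, 1, 3, 2, 1, 2, 2, 4]
novices = [4, 5, 3, 4, 5, 4, 2, 5, 4, 3]
print(percent_correct(experts, novices))  # 85.0 for these invented scores
```

Because the same subjects are used both to set the cut point and to score it, a figure computed this way tends to be optimistic about how well new individuals would be classified.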
4.5.3 Analogical Reasoning. Our goals in this domain have been
to develop descriptive and prescriptive models of analogical reasoning
within the context of generating predictions in technological environments.

A large body of previous work (e.g., Sternberg, 1977) on analogy
has focused on the four-term judgement format, a:b::c:d. This format
is very common in educational measurement. We feel that no existing model
of this format can be applied to any other reasoning by analogy, since
the four-part analogy problem does not require subjects to identify or select
analogues, or use analogues to solve problems. The four-part analogy
problem primarily tests a subject's ability to recognize analytical and
cultural factors that make certain types of similarity more relevant than others.

We felt that the analysis of analogical reasoning presented by
philosophers of science (e.g., Hesse, 1966; Kuhn, 1962) would be more
applicable to technological environments. These arguments were presented
by Weitzenfeld and Klein (1979).

We tested the two approaches (four-part analogy problem vs.
philosophy of science model) in a study with Air Force engineers. We
interviewed seven engineers who had participated in an effort to predict
the reliability of subsystems for the B-1 aircraft. The method they used,
comparability analysis, consisted of comparing analogous subsystems on
aircraft currently in operational use. Essentially, they were reasoning
by analogy in a technological domain. Our interviews attempted to learn how
they were doing this. Our results (Klein and Weitzenfeld, 1980) did not
support either of the two approaches we were testing. We found that the
Sternberg model simply was not relevant for the main activities of the
engineers: selecting, rejecting, modifying, and using comparison cases.

However, the philosophy of science model was also inadequate, because the
concerns of science are different from those of technology. The
scientist wishes to identify new hypotheses to test; accordingly the
philosophy of science model emphasized the ambiguous features of the two
analogues (features which were not clearly similarities or differences)
and regarded them as sources of new hypotheses. The goal in technology is
to predict specific items of information, not to discover new hypotheses.
Therefore, the engineers were attempting to identify comparison cases that
allowed them to predict the reliability of specific B-1 subsystems.

By examining their strategies, we obtained a clearer understanding
of the task of technological prediction. The use of analogies or comparison
cases has always seemed risky in such situations, because the general feeling
has been that the force of the comparison is based on the extent of similarity
between the target and comparison domain, and is therefore probabilistic.
Our attention was turned to the rational basis for reasoning by analogy.
We found that there was no sound basis for drawing inferences on the grounds
of degree of similarity (Weitzenfeld, 1980). We do not think people actually
reason that way. Instead, we believe that analogical reasoning is based on
deductive, rather than probabilistic, reasoning.

Weitzenfeld (1981) has been able to define the necessary conditions
for obtaining valid inferences using analogical reasoning. This paper is
important for several reasons. First, its new deductive rationale for
analogical reasoning is radically different from previous accounts. (It
bears some resemblance to work in the cognitive sciences, but none to
philosophical or psychological models.) It explains both why reasoning by
analogy is so common (it can be as valid as deductive reasoning) and why it
fails so often (it requires premisses that are hard to establish).

Second, by specifying the conditions under which such reasoning is
valid, it raises cautions about inferences from comparisons that do not
satisfy these presuppositions. Third, it is the basis for a prescriptive
account of how to go about using analogies. We are now applying it to
problems of training device design. Fourth, it can be the basis for descriptive
theories of human use of analogues. The data we elicited from engineers fall
into place when interpreted as intuitive applications of this model.

We think the study will be important in the philosophy of science
and the psychology of analogy, as well as being a normative tool. It shows
that the use of comparisons by experts must be a more complex process than
has been thought if it is to lead to valid conclusions.

The new account of reasoning by analogy is based upon identities of
structure among systems. It discusses the variety of forms of structure and
their relative stability. It provides a formal definition of structural
identity that avoids difficulties encountered by previous definitions. It
discusses ways of discovering the existence of such identities and shows how
they license different inferences. Among the conclusions is the central
methodological rule for analogy: select analogues to match on variables
that are not understood, and then correct for the differences that you do
understand.
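The central rule can be illustrated with a minimal sketch. All of it is hypothetical: the variable names, the comparison cases, the numbers, and the choice of multiplicative correction factors are assumptions made for illustration, and none of it is taken from the engineers' comparability analyses or from the formal account itself.

```python
def pick_analogue(target, candidates, not_understood):
    """Choose the case matching the target on the most poorly understood variables."""
    def matches(case):
        return sum(case["traits"].get(v) == target["traits"].get(v)
                   for v in not_understood)
    return max(candidates, key=matches)


def predict(analogue_value, corrections):
    """Adjust the analogue's observed value for differences that ARE understood.

    Multiplicative factors are an assumption of this sketch.
    """
    value = analogue_value
    for factor in corrections.values():
        value *= factor
    return value


# Hypothetical target subsystem and two candidate comparison cases.
target = {"traits": {"vibration": "high", "maintenance": "depot"}}
candidates = [
    {"name": "case 1", "hours_between_failures": 400.0,
     "traits": {"vibration": "high", "maintenance": "depot"}},
    {"name": "case 2", "hours_between_failures": 900.0,
     "traits": {"vibration": "low", "maintenance": "depot"}},
]

# Match on the variables whose effects are not understood ...
analogue = pick_analogue(target, candidates, ["vibration", "maintenance"])
# ... then correct for differences whose effects are understood.
estimate = predict(analogue["hours_between_failures"],
                   {"duty cycle": 0.5, "improved seals": 1.5})
print(analogue["name"], estimate)
```

The point of the sketch is the division of labor: matching handles what cannot be modeled, and explicit corrections handle what can.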

All of this work, including the initial model, the research with the
engineers, the analysis of similarity, and the prescriptive model, was
supported under the present contract. This work is currently being continued
in efforts funded by the U.S. Army Research Institute, to predict the
training effectiveness of new training devices.

4.5.4 Decision Making. Our work in decision making was essentially
an outgrowth of our research into proficient performance. The domain of

o 

interest was tactical C decision making. Our hypothesis is that expertise
at such decision making depends on recognitional capacities and analogical
reasoning, developed through many hours of experience. We are skeptical
of calculational approaches to such decision making. Therefore, we were
interested in attempts to develop automated decision aids in this area.
We feel that such aids can represent significant improvements in efficiency
and performance. However, we are concerned that such aids could diminish
the performance of proficient decision makers if the aids are based upon a
calculational model of proficiency and so prevent the skilled decision maker
from utilizing recognitional capacities. These issues were described in a
working paper (Klein, 1978) and were presented at two conferences
(Klein, 1980b; 1981).




5. Research Applications. In addition to providing a test of a general
theory of proficient performance, the experiments described above may
have implications for a variety of applied issues.

5.1  Decision  Making.  The  general  area  of  the  development  of  automated 

9 

decision aids for tactical C presupposes that complex tasks can be
divided into basic elements, and that decision analytic procedures can be
applied as formal operations, to provide guidance for command battle managers.
However, the perceptual/recognitional description of proficient performance
that we have been developing raises some questions about such work (Klein, 1980b).
We have attempted to use this work to derive guidelines for the allocation
of decisions within the human-system interface, in a way that is consistent
with the skills of the experienced operator.

5.2 Predictive Logic. Requirements to predict reliability of
subcomponents of new aircraft, or to predict training effectiveness of new
simulation devices, seem to be based on reasoning by analogy. As we
gain a clearer understanding of how people identify and use analogues,
we should be able to provide more specific guidance for tasks requiring
predictions. The analysis developed by Weitzenfeld (1981b) presents a
prescriptive model for the activities required in order to ensure valid
predictive capabilities.

5.3 Training Requirements. Simple rule-based descriptions of tasks are
usually sufficient for defining training requirements when dealing with novices,
and with procedural tasks. However, when dealing with non-procedural tasks,
and with developing higher levels of proficiency, new methods are needed
for identifying training requirements. The research reported should be
valuable in this effort, by demonstrating the relevance of perceptual
learning, goal frameworks, and analogical reasoning for capturing the basis
for proficient performance. This offers the possibility of deriving training
requirement analyses using methods for gauging the perceptual dimensions used
by trainees, the sophistication of their recognitional capacities, and the
types of analogues that they have available for use.

5.4 Workload. Our research with proficient chess players demonstrated their
ability to maintain competent performance under extreme time pressures.
Presumably, recognitional capacities are not as affected by limitations on
working memory as calculational capacities. This raises questions about
the ability to use recognitional strategies to overcome workload requirements.
presented at APA convention, New York City, 1979.

(b) Presentation on analogical reasoning; Air Force Human Resources
    Laboratory, Wright-Patterson AFB, Ohio; July, 1979.

(c) Klein, G. A. Automated aids for the proficient decision maker.
    Paper presented at IEEE Conference, Boston, MA, 1980.

(d) Klein, G. A. A perceptual/recognitional model of decision
    making. Paper presented at Summer Computer Simulation
    Conference, Washington, DC, July, 1981.

(e) Presentation on highly proficient performance; Aerospace
    Medical Research Laboratory, Wright-Patterson AFB, Ohio; July, 1981.

(f) Klein, G. A., & Klein, H. A. Perceptual/cognitive analysis of
    proficient cardio-pulmonary resuscitation (CPR) performance.
    Paper presented at the Midwestern Psychological Association
    meetings, Detroit, Michigan, 1981.

(g) Weitzenfeld, J. Knowledge-how and natural kinds in psychology.
    Paper presented at the New Jersey Regional Philosophical
    Association, Princeton, 1981.


